Abstract

We study tail probabilities via some Gaussian approximations. Our results make refinements
to large deviation theory. The proof builds on classical results by Bahadur and
Rao. Binomial distributions and their tail probabilities are discussed in more detail.
1. Introduction

Let $X_1, \dots, X_n$ be i.i.d. random variables such that the moment generating function
$E[\exp(\beta X_i)]$ is finite in a neighborhood of the origin. For fixed $\mu > E[X_1]$, the aim of
this paper is to approximate the tail distribution:

$$P_{\mu,n} := P\left\{\frac{1}{n}\sum_{i=1}^{n} X_i \ge \mu\right\}.$$

If $\mu$ is close to the mean of $X_1$ one would usually approximate $P_{\mu,n}$ by a tail probability
of a Gaussian random variable. If $\mu$ is far from the mean of $X_1$ the tail probability can be
estimated using large deviation theory. According to the Sanov theorem the probability
that the deviation from the mean is as large as $\mu$ is of the order $\exp(-nD)$ where $D$ is a
constant. Bahadur and Rao [2] improved the estimate of this large deviation probability,
and the goal of this paper is to extend the Gaussian tail approximations into situations
where one normally uses large deviation techniques.

Let $\varphi$ and $\Phi$ be the density function and the distribution function of the standard
Gaussian, respectively. Let $P_0$ denote a probability measure describing the distribution
of a random variable $X$. Consider the 1-dimensional exponential family $(P_\beta)$ based on $P_0$
and given by

$$\frac{dP_\beta}{dP_0}(x) = \frac{\exp(\beta \cdot x)}{Z(\beta)},$$

where the denominator is the moment generating function (partition function) given by

$$Z(\beta) = \int \exp(\beta \cdot x)\, dP_0(x) = E\left[e^{\beta X}\right].$$

The mean value of $P_\beta$ is

$$\frac{Z'(\beta)}{Z(\beta)},$$

and the range of this function will be denoted $M$ and will be called the mean value range
of the exponential family.

For $\mu$ in the interior of $M$ the maximum likelihood estimate $\hat\beta(\mu)$ equals the $\beta$ such that
the mean value of $P_\beta$ equals $\mu$, which in this case is the average of the i.i.d. samples.
Put $P^\mu = P_{\hat\beta(\mu)}$. Equivalently, $\hat\beta(\mu)$ can be defined as the solution of the equation

$$\frac{Z'(\beta)}{Z(\beta)} = \mu. \tag{1}$$

Let $V(\mu)$ denote the variance of $P^\mu$.
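Information divergence is given by

$$D(P^\mu \| P_0) = \int \ln\left(\frac{dP^\mu}{dP_0}(x)\right) dP^\mu(x).$$

We see that

$$D(P^\mu \| P_0) = \mu \cdot \hat\beta(\mu) - \ln Z(\hat\beta(\mu)). \tag{2}$$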
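The quantities of this section can be checked numerically. The following Python sketch is an illustration only and not part of the original argument: it assumes a hypothetical finite distribution $P_0$ (a fair six-sided die), and the helper names `tilt`, `mean_var`, `beta_hat` and `divergence` are our own. It solves $Z'(\beta)/Z(\beta) = \mu$ by bisection and compares the divergence $D(P^\mu \| P_0)$ computed from its definition with the value $\mu \cdot \hat\beta(\mu) - \ln Z(\hat\beta(\mu))$ that follows from the definitions above.

```python
import math

def tilt(xs, ps, beta):
    """Return Z(beta) and the point probabilities of P_beta for a finite P_0."""
    weights = [p * math.exp(beta * x) for x, p in zip(xs, ps)]
    z = sum(weights)
    return z, [w / z for w in weights]

def mean_var(xs, qs):
    """Mean and variance of a finite distribution with points xs and weights qs."""
    m = sum(x * q for x, q in zip(xs, qs))
    v = sum((x - m) ** 2 * q for x, q in zip(xs, qs))
    return m, v

def beta_hat(xs, ps, mu, lo=-50.0, hi=50.0, iters=200):
    """Solve Z'(beta)/Z(beta) = mu by bisection (the mean is increasing in beta)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        _, qs = tilt(xs, ps, mid)
        if mean_var(xs, qs)[0] < mu:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def divergence(qs, ps):
    """Information divergence D(Q || P) for finite distributions."""
    return sum(q * math.log(q / p) for q, p in zip(qs, ps) if q > 0)

# Assumed example: P_0 is a fair die, and we ask for the member with mean 4.5.
xs = [1, 2, 3, 4, 5, 6]
ps = [1 / 6] * 6
mu = 4.5

b = beta_hat(xs, ps, mu)
z, qs = tilt(xs, ps, b)
m, v = mean_var(xs, qs)                 # m should reproduce mu; v is V(mu)
d_direct = divergence(qs, ps)           # D(P^mu || P_0) from the definition
d_formula = mu * b - math.log(z)        # mu * beta_hat(mu) - ln Z(beta_hat(mu))
print(b, m, v, d_direct, d_formula)
```

The bisection bracket `[-50, 50]` is an arbitrary choice that is wide enough for this example; other base distributions may need a different bracket.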
2. Approximation of tail distributions for non-lattice valued variables

Let $\mu_0 := E[X_1]$ denote the mean of $P_0$, and introduce the notation

$$\mu^* := \sup\{\mu > \mu_0 : D(P^\mu \| P_0) < \infty\} = \sup M.$$

Bahadur and Rao [2] proved a refined version of the large deviation bound, but some
aspects of their result date back to Cramér, and part of it was proved by a different
method by Blackwell and Hodges [3]. For $\mu^* > \mu > \mu_0$, the Sanov theorem implies that

$$\frac{1}{n} \ln P\left\{\frac{1}{n}\sum_{i=1}^{n} X_i \ge \mu\right\} \to -D(P^\mu \| P_0) \quad \text{for } n \to \infty.$$

Bahadur and Rao [2] verified the following improvement of the Sanov theorem

$$P\left\{\frac{1}{n}\sum_{i=1}^{n} X_i \ge \mu\right\} = \frac{\exp(-nD(P^\mu \| P_0))}{(2\pi n V(\mu))^{1/2}\, \hat\beta(\mu)}\, (1 + o(1)) \quad \text{for } n \to \infty \tag{3}$$

for non-lattice random variables.

We will write $D(\mu)$ as short for $D(P^\mu \| P_0)$.
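As a numerical illustration of the Bahadur and Rao approximation, the sketch below uses an assumed example that is not taken from the text: $X_i$ standard exponential, which is non-lattice. For this family a direct calculation gives $Z(\beta) = 1/(1-\beta)$ for $\beta < 1$, hence $\hat\beta(\mu) = 1 - 1/\mu$, $V(\mu) = \mu^2$ and $D(\mu) = \mu - 1 - \ln\mu$, and the exact tail is an Erlang tail. The function names are our own.

```python
import math

def exact_tail_exp(n, mu):
    """P{(X_1 + ... + X_n)/n >= mu} for i.i.d. Exp(1) variables.

    The sum is Erlang(n, 1), so the tail at x = n*mu equals
    exp(-x) * sum_{k<n} x^k / k!; each term is evaluated in log-space.
    """
    x = n * mu
    log_x = math.log(x)
    return sum(math.exp(k * log_x - x - math.lgamma(k + 1)) for k in range(n))

def bahadur_rao_exp(n, mu):
    """exp(-n D(mu)) / ((2 pi n V(mu))^(1/2) * beta_hat(mu)) for Exp(1), mu > 1."""
    beta = 1.0 - 1.0 / mu            # beta_hat(mu)
    var = mu ** 2                    # V(mu)
    div = mu - 1.0 - math.log(mu)    # D(mu)
    return math.exp(-n * div) / (math.sqrt(2.0 * math.pi * n * var) * beta)

mu = 2.0
ratios = {n: exact_tail_exp(n, mu) / bahadur_rao_exp(n, mu) for n in (10, 100, 1000)}
for n, r in ratios.items():
    print(n, r)   # ratio exact / approximation; it should approach 1 as n grows
```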
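Theorem 1. For $\mu^* > \mu > \mu_0$, one has that

$$P\left\{\frac{1}{n}\sum_{i=1}^{n} X_i \ge \mu\right\} = \Phi\left(-n^{1/2}\left(2D\left(\mu - \frac{c_\mu}{n}\right)\right)^{1/2}\right) (1 + o(1)) \quad \text{for } n \to \infty, \tag{4}$$

where

$$c_\mu := \frac{1}{\hat\beta(\mu)} \ln \frac{(2D(\mu))^{1/2}}{V(\mu)^{1/2}\, \hat\beta(\mu)}. \tag{5}$$
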
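Proof. The $c_\mu$ defined by (5) satisfies the equation

$$\frac{\exp(\hat\beta(\mu)\, c_\mu)}{(2D(\mu))^{1/2}} = \frac{1}{V(\mu)^{1/2}\, \hat\beta(\mu)}. \tag{6}$$

The tail probabilities of the standard Gaussian satisfy

$$\frac{\varphi(z)}{z}\left(1 - \frac{1}{z^2}\right) < \Phi(-z) < \frac{\varphi(z)}{z}$$

for $z > 0$ (cf. Feller, p. 179), which implies that

$$\Phi\left(-n^{1/2}\left(2D\left(\mu - \frac{c_\mu}{n}\right)\right)^{1/2}\right) = \frac{\exp\left(-nD\left(\mu - \frac{c_\mu}{n}\right)\right)}{(2\pi n)^{1/2} \left(2D\left(\mu - \frac{c_\mu}{n}\right)\right)^{1/2}} \left(1 + O\left(\frac{1}{n}\right)\right)$$

and so

$$\Phi\left(-n^{1/2}\left(2D\left(\mu - \frac{c_\mu}{n}\right)\right)^{1/2}\right) = \frac{\exp\left(-nD\left(\mu - \frac{c_\mu}{n}\right)\right)}{(2\pi n)^{1/2} (2D(\mu))^{1/2}} \left(1 + O\left(\frac{1}{n}\right)\right). \tag{7}$$

Because of (1) and (2), the derivative can be calculated as

$$\frac{d}{d\mu} D(\mu) = \hat\beta(\mu),$$

leading to the following Taylor expansion

$$D\left(\mu - \frac{c_\mu}{n}\right) = D(\mu) - \hat\beta(\mu) \cdot \frac{c_\mu}{n} + O\left(\frac{1}{n^2}\right).$$

Thus,

$$\begin{aligned}
\frac{\exp\left(-nD\left(\mu - \frac{c_\mu}{n}\right)\right)}{(2\pi n)^{1/2} (2D(\mu))^{1/2}}
&= \frac{\exp\left(-n\left(D(\mu) - \hat\beta(\mu) \cdot \frac{c_\mu}{n} + O\left(\frac{1}{n^2}\right)\right)\right)}{(2\pi n)^{1/2} (2D(\mu))^{1/2}} \\
&= \frac{\exp\left(-nD(\mu) + \hat\beta(\mu)\, c_\mu + O\left(\frac{1}{n}\right)\right)}{(2\pi n)^{1/2} (2D(\mu))^{1/2}} \\
&= \frac{\exp(-nD(\mu))\, e^{\hat\beta(\mu) c_\mu}}{(2\pi n)^{1/2} (2D(\mu))^{1/2}} \left(1 + O\left(\frac{1}{n}\right)\right).
\end{aligned} \tag{8}$$

According to (3) we also have

$$P\left\{\frac{1}{n}\sum_{i=1}^{n} X_i \ge \mu\right\} = \frac{\exp(-nD(\mu))}{(2\pi n V(\mu))^{1/2}\, \hat\beta(\mu)}\, (1 + o(1)) \quad \text{for } n \to \infty, \tag{9}$$

therefore applying (6), (7), (8) and (9) the proof of Theorem 1 is complete. □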
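The Gaussian approximation $\Phi\left(-n^{1/2}(2D(\mu - c_\mu/n))^{1/2}\right)$ can be tried out on a case where the exact tail is available. The sketch below is only an illustration under assumptions of our own: $X_i$ is standard exponential, so that $\hat\beta(\mu) = 1 - 1/\mu$, $V(\mu) = \mu^2$ and $D(\mu) = \mu - 1 - \ln\mu$, and all function names are hypothetical.

```python
import math

def phi_neg(z):
    """Phi(-z) for the standard Gaussian, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def div_exp(mu):
    """D(mu) for the exponential family generated by Exp(1)."""
    return mu - 1.0 - math.log(mu)

def c_exp(mu):
    """c_mu = ln((2 D)^(1/2) / (V^(1/2) * beta_hat)) / beta_hat for Exp(1)."""
    beta = 1.0 - 1.0 / mu
    return math.log(math.sqrt(2.0 * div_exp(mu)) / (mu * beta)) / beta

def gaussian_tail_approx(n, mu):
    """Phi(-n^(1/2) * (2 D(mu - c_mu / n))^(1/2))."""
    z = math.sqrt(2.0 * n * div_exp(mu - c_exp(mu) / n))
    return phi_neg(z)

def exact_tail_exp(n, mu):
    """Exact Erlang tail P{(X_1 + ... + X_n)/n >= mu}, summed in log-space."""
    x = n * mu
    log_x = math.log(x)
    return sum(math.exp(k * log_x - x - math.lgamma(k + 1)) for k in range(n))

mu = 2.0
for n in (5, 50, 500):
    approx = gaussian_tail_approx(n, mu)
    exact = exact_tail_exp(n, mu)
    print(n, exact, approx, approx / exact)
```

In this example $c_\mu$ is negative, so the argument $\mu - c_\mu/n$ stays above $\mu_0 = 1$ for every $n$.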

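Remark 1. If in the approximation $c_\mu$ is replaced by any other constant $c$ then the ratio
of the two approximations tends to a number, which is not equal to 1:

$$\frac{\exp\left(-nD\left(\mu - \frac{c_\mu}{n}\right)\right)}{\exp\left(-nD\left(\mu - \frac{c}{n}\right)\right)} = \exp\left(\hat\beta(\mu) \cdot (c_\mu - c) + O\left(\frac{1}{n}\right)\right) \to \exp\left(\hat\beta(\mu) \cdot (c_\mu - c)\right) \neq 1.$$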
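The limit in Remark 1 can be observed numerically. The sketch below again assumes, purely for illustration, standard exponential variables with $D(\mu) = \mu - 1 - \ln\mu$ and $\hat\beta(\mu) = 1 - 1/\mu$, and compares $c_\mu$ with the hypothetical alternative $c = 0$. It works with the logarithm of the ratio $\exp(-nD(\mu - c_\mu/n)) / \exp(-nD(\mu - c/n))$ so that nothing underflows for large $n$.

```python
import math

def div_exp(mu):
    """D(mu) for the exponential family generated by Exp(1)."""
    return mu - 1.0 - math.log(mu)

mu = 2.0
beta = 1.0 - 1.0 / mu                                        # beta_hat(mu)
c_mu = math.log(math.sqrt(2.0 * div_exp(mu)) / (mu * beta)) / beta
c_other = 0.0                                                # assumed alternative constant

def log_ratio(n):
    """ln of exp(-n D(mu - c_mu/n)) / exp(-n D(mu - c_other/n))."""
    return n * (div_exp(mu - c_other / n) - div_exp(mu - c_mu / n))

limit = beta * (c_mu - c_other)                              # predicted limit of the log-ratio
for n in (10, 1000, 100000):
    print(n, math.exp(log_ratio(n)), math.exp(limit))
```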

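Remark 2. If $X_1$ has a density with respect to the Lebesgue measure then Bahadur and
Rao [2] proved the stronger result that

$$P\left\{\frac{1}{n}\sum_{i=1}^{n} X_i \ge \mu\right\} = \frac{\exp(-nD(P^\mu \| P_0))}{(2\pi n V(\mu))^{1/2}\, \hat\beta(\mu)} \left(1 + O\left(\frac{1}{n}\right)\right).$$

Using this result we get the following theorem: If $X_1$ has a density with respect to the
Lebesgue measure then

$$P\left\{\frac{1}{n}\sum_{i=1}^{n} X_i \ge \mu\right\} = \Phi\left(-n^{1/2}\left(2D\left(\mu - \frac{c_\mu}{n}\right)\right)^{1/2}\right) \left(1 + O\left(\frac{1}{n}\right)\right) \quad \text{for } n \to \infty,$$

for any $\mu^* > \mu > \mu_0$.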



3. Results for lattice valued variables 



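Now assume that $X_1, X_2, \dots$ is a sequence of i.i.d. random variables with values in
a lattice of the type $\{kd + \delta \mid k \in \mathbb{Z}\}$. For such a sequence Bahadur and Rao [2] proved
that

$$P\left\{\frac{1}{n}\sum_{i=1}^{n} X_i \ge \mu\right\} = \frac{\exp(-nD(P^\mu \| P_0))\, d}{(2\pi n V(\mu))^{1/2} \left(1 - \exp(-d\, \hat\beta(\mu))\right)} \left(1 + O\left(\frac{1}{n}\right)\right) \quad \text{for } n \to \infty \tag{10}$$

for any $n$ such that $P\left\{\frac{1}{n}\sum_{i=1}^{n} X_i = \mu\right\} > 0$. We note that the result (3) for non-lattice
variables can be considered as a limiting version of (10) for small $d > 0$ because

$$\frac{1 - \exp(-d\beta)}{d} \to \beta \quad \text{for } d \to 0.$$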
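Theorem 2. Assume that $X_1$ has values in the lattice $\{kd + \delta \mid k \in \mathbb{Z}\}$ and that
$\mu^* > \mu > \mu_0$. Then for any $n$ such that $P\left\{\frac{1}{n}\sum_{i=1}^{n} X_i = \mu\right\} > 0$ one has

$$P\left\{\frac{1}{n}\sum_{i=1}^{n} X_i \ge \mu\right\} = \Phi\left(-n^{1/2}\left(2D\left(\mu - \frac{c_\mu}{n}\right)\right)^{1/2}\right) \left(1 + O\left(\frac{1}{n}\right)\right) \quad \text{for } n \to \infty,$$

where

$$c_\mu := \frac{1}{\hat\beta(\mu)} \ln \frac{(2D(\mu))^{1/2}\, d}{V(\mu)^{1/2} \left(1 - \exp(-d\, \hat\beta(\mu))\right)}.$$

Proof. If $X_i$ is lattice valued then the proof of Theorem 1 can be modified by replacing
$\hat\beta(\mu)$ by $\left(1 - \exp(-d\, \hat\beta(\mu))\right)/d$ at the appropriate places throughout the proof. There is no
modification in the use of a Taylor expansion. □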



We now turn to the special case, where $X_1, \dots, X_n$ are i.i.d. Bernoulli random variables
with

$$X_i = \begin{cases} 1 & \text{with probability } p, \\ 0 & \text{with probability } 1 - p. \end{cases}$$

In this case $d = 1$, and $\sum_{i=1}^{n} X_i$ is a binomial random variable. For various
refinements of (10), see Bahadur, Littlewood and McKay.

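Corollary 1. Put

$$\mu_n := \lceil n\mu \rceil / n.$$

Then for $1 > \mu > p$ one has that

$$P\left\{\frac{1}{n}\sum_{i=1}^{n} X_i \ge \mu\right\} = \Phi\left(-n^{1/2}\left(2D\left(\mu_n - \frac{c_{\mu_n}}{n}\right)\right)^{1/2}\right) \left(1 + O\left(\frac{1}{n}\right)\right) \quad \text{for } n \to \infty,$$

where

$$D(\mu) = D(\mu \| p) = \mu \ln\frac{\mu}{p} + (1 - \mu) \ln\frac{1 - \mu}{1 - p}$$

and

$$c_\mu = \frac{1}{2} + \frac{\ln\dfrac{2p(1-p)\, D(\mu)}{(\mu - p)^2}}{2 \ln\dfrac{\mu(1-p)}{(1-\mu)p}}.$$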
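Proof. Because of the definition of $\mu_n$,

$$P\left\{\frac{1}{n}\sum_{i=1}^{n} X_i \ge \mu\right\} = P\left\{\frac{1}{n}\sum_{i=1}^{n} X_i \ge \mu_n\right\}$$

and the condition $P\left\{\frac{1}{n}\sum_{i=1}^{n} X_i = \mu_n\right\} > 0$ is satisfied, and so Theorem 2 implies that

$$P\left\{\frac{1}{n}\sum_{i=1}^{n} X_i \ge \mu_n\right\} = \Phi\left(-n^{1/2}\left(2D\left(\mu_n - \frac{c_{\mu_n}}{n}\right)\right)^{1/2}\right) \left(1 + O\left(\frac{1}{n}\right)\right) \quad \text{for } n \to \infty.$$

We have to evaluate $c_\mu$. The distribution $P_\beta$ has

$$P_\beta(X_i = 1) = \frac{p e^{\beta}}{1 - p + p e^{\beta}},$$

which is also the mean of $P_\beta$. The equation

$$\frac{p e^{\beta}}{1 - p + p e^{\beta}} = \mu$$

is equivalent to

$$e^{\beta} = \frac{\mu(1-p)}{(1-\mu)p},$$

implying that

$$\hat\beta(\mu) = \ln\frac{\mu(1-p)}{(1-\mu)p}.$$

The variance function is

$$V(\mu) = \mu(1 - \mu).$$

Thus, we have

$$1 - \exp(-\hat\beta(\mu)) = 1 - \frac{(1-\mu)p}{\mu(1-p)} = \frac{\mu - p}{\mu(1-p)}.$$

Inserting these expressions into the formula for $c_\mu$ in Theorem 2 with $d = 1$ gives

$$c_\mu = \frac{1}{\hat\beta(\mu)} \ln\left(\frac{1-p}{\mu - p}\left(\frac{2\mu D(\mu)}{1-\mu}\right)^{1/2}\right) = \frac{1}{2\hat\beta(\mu)} \ln\left(e^{\hat\beta(\mu)} \cdot \frac{2p(1-p) D(\mu)}{(\mu-p)^2}\right) = \frac{1}{2} + \frac{\ln\dfrac{2p(1-p) D(\mu)}{(\mu-p)^2}}{2\hat\beta(\mu)}.$$

□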
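For the binomial case the approximation is easy to compare with the exact tail. The sketch below is an illustration only, with function names of our own. It evaluates $c_\mu$ from the lattice formula with $d = 1$, $\hat\beta(\mu) = \ln\frac{\mu(1-p)}{(1-\mu)p}$ and $V(\mu) = \mu(1-\mu)$, rounds $\mu$ up to $\mu_n = \lceil n\mu \rceil / n$, and assumes $p < \mu_n < 1$.

```python
import math

def phi_neg(z):
    """Phi(-z) for the standard Gaussian, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def div_bern(mu, p):
    """D(mu || p) = mu ln(mu/p) + (1 - mu) ln((1 - mu)/(1 - p)), for 0 < mu < 1."""
    return mu * math.log(mu / p) + (1.0 - mu) * math.log((1.0 - mu) / (1.0 - p))

def c_lattice(mu, p, d=1.0):
    """c_mu = ln((2D)^(1/2) d / (V^(1/2) (1 - exp(-d beta_hat)))) / beta_hat."""
    beta = math.log(mu * (1.0 - p) / ((1.0 - mu) * p))
    var = mu * (1.0 - mu)
    ratio = math.sqrt(2.0 * div_bern(mu, p)) * d / (math.sqrt(var) * (1.0 - math.exp(-d * beta)))
    return math.log(ratio) / beta

def threshold(n, mu):
    """ceil(n * mu), with a small guard against floating-point round-up."""
    return math.ceil(n * mu - 1e-9)

def gaussian_binomial_tail(n, mu, p):
    """Phi(-n^(1/2) (2 D(mu_n - c/n))^(1/2)) with c evaluated at mu_n."""
    mu_n = threshold(n, mu) / n          # must satisfy p < mu_n < 1
    shifted = mu_n - c_lattice(mu_n, p) / n
    return phi_neg(math.sqrt(2.0 * n * div_bern(shifted, p)))

def exact_binomial_tail(n, mu, p):
    """P{X_1 + ... + X_n >= n * mu} for i.i.d. Bernoulli(p) variables."""
    k = threshold(n, mu)
    return sum(math.comb(n, j) * p ** j * (1.0 - p) ** (n - j) for j in range(k, n + 1))

for n, mu in ((20, 0.75), (100, 0.6)):
    exact = exact_binomial_tail(n, mu, 0.5)
    approx = gaussian_binomial_tail(n, mu, 0.5)
    print(n, mu, exact, approx, approx / exact)
```

`math.comb` requires Python 3.8 or later.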



Remark 3. For $p = 1/2$, $0.5 < c_\mu < 0.534$, and Table 1 shows some numerical values
for $c_\mu \approx 0.5 + (\mu - 0.5)/12$.

    $\mu$      0.6    0.65   0.7    0.75   0.8    0.85   0.9
    $c_\mu$    0.508  0.512  0.516  0.520  0.524  0.528  0.532

Table 1: Numerical values
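The values in Table 1 can be recomputed with a few lines of Python. The sketch below is a hedged illustration: the function `c_half` is our own name, and it uses the closed form $c_\mu = \frac{1}{2} + \ln\left(2p(1-p)D(\mu)/(\mu-p)^2\right) / \left(2\ln\frac{\mu(1-p)}{(1-\mu)p}\right)$, which is obtained from the lattice formula with $d = 1$. It prints the computed $c_\mu$ next to the tabulated value and the linear approximation $0.5 + (\mu - 0.5)/12$.

```python
import math

def c_half(mu):
    """c_mu for Bernoulli(1/2) variables (lattice span d = 1), for 1/2 < mu < 1."""
    p = 0.5
    div = mu * math.log(mu / p) + (1.0 - mu) * math.log((1.0 - mu) / (1.0 - p))
    beta = math.log(mu * (1.0 - p) / ((1.0 - mu) * p))
    return 0.5 + math.log(2.0 * p * (1.0 - p) * div / (mu - p) ** 2) / (2.0 * beta)

# Values as tabulated in Table 1.
table = {0.6: 0.508, 0.65: 0.512, 0.7: 0.516, 0.75: 0.520,
         0.8: 0.524, 0.85: 0.528, 0.9: 0.532}

for mu, tabulated in table.items():
    linear = 0.5 + (mu - 0.5) / 12.0
    print(mu, round(c_half(mu), 4), tabulated, round(linear, 4))
```

The computed values agree with the tabulated ones to within about $10^{-3}$; the table entries are not always the nearest rounding to three decimals.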



4. Discussion 



As discussed by Reiczigel, Rejtő and Tusnády [10] and by Harremoës and Tusnády,
there are some strong indications that these asymptotic results can be strengthened
to sharp inequalities. Such sharp inequalities would imply the present asymptotic results
as corollaries. We hope that the asymptotics presented here can help in proving the
conjectured sharp inequalities. Related sharp inequalities have been discussed by León
and Perron and by Talagrand. Numerical experiments have also shown that our tail
estimates are useful even for small values of $n$.